Skip to content

Refactor storage#12840

Open
stsewd wants to merge 5 commits intomainfrom
refactor-storage
Open

Refactor storage#12840
stsewd wants to merge 5 commits intomainfrom
refactor-storage

Conversation

@stsewd
Copy link
Member

@stsewd stsewd commented Mar 11, 2026

Currently:

  • We have mix of file system storage with cloud storage methods mixed together
  • Not all storage backends support the same features. So, we can't use the methods from build media on other storage backends.
  • We are subclassing just to override some values

So, I'm refactoring our storage classes to:

  • Have a common interface, this interface needs to be implemented for local and cloud storage classes.
  • All storage classes share this same interface, so we can use the same methods in all of them.

Some other changes:

  • copy_directory/walk was no longer used, so it was deleted (it was replaced by rclone sync).
  • get_available_name on the cloud storage class doesn't need overriding, as the storage backend now allows overriding files (true by default).
  • get_available_name on the file system storage class doesn't need overriding, as the storage backend now supports overriding files (false by default, but it's passed as an extra option in STORAGES).

This is just the start to make sure nothing breaks. After this I want:

  • Move more options to be declared in STORAGES instead of subclassing.
  • Make our tests rely less on the local storage classes
  • Re-implement delete_directory to make use of the delete prefix S3 API. Currently, we are deleting file by file, which is slow, and generates lots of API calls.

@stsewd stsewd marked this pull request as ready for review March 12, 2026 20:28
@stsewd stsewd requested a review from a team as a code owner March 12, 2026 20:28
@stsewd stsewd requested a review from humitos March 12, 2026 20:28
Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of the refactor and the next steps in particular. I know that we had some issues with get_available_name in the past (see #11505). We need to make sure this is working as expected.

Storage is key in our infra and I'm a little worried here about breaking something here, so I'm requesting review from another dev here as well just to make sure we are not missing anything.

@stsewd stsewd moved this to Needs review in 📍Roadmap Mar 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Needs review

Development

Successfully merging this pull request may close these issues.

2 participants